- Tuesday, September 3, 2024
xAI has brought online Colossus, its 100,000-GPU H100 training cluster, the largest in the world, with plans to double its size within a few months.
- Tuesday, July 23, 2024
xAI has activated a supercomputer cluster comprising 100,000 Nvidia H100 GPUs in Memphis, Tennessee, making it effectively the most powerful AI training cluster available today. The supercomputer is expected to be used to train Grok, the company's large language model. xAI is estimated to have spent about $3 billion to $4 billion on the project, and it recently announced that it is hiring to extend its advantage in human talent.
- Thursday, July 4, 2024
xAI's next model is slated for August and reportedly took 20,000 H100s to train. Grok 3 is rumored for the end of the year and may require 100,000 H100s.
- Wednesday, July 10, 2024
xAI has ended its deal with Oracle and will build its own data center once Grok 2 training finishes. The company had originally contracted for 24,000 H100s from Oracle.
- Thursday, June 13, 2024
Meta is upgrading its global data centers to support AI demands. It is planning to scale to 600,000 GPUs for AI training jobs. This involves innovative maintenance strategies and tools like OpsPlanner to ensure minimal disruptions and consistent performance while enabling rapid infrastructure scaling.
- Tuesday, September 17, 2024
Elon Musk's xAI is building "Colossus," the world's largest supercomputer, in Memphis to support its AI chatbot, Grok. The project has drawn criticism over environmental oversight and its substantial energy and water demands, but xAI is pressing ahead to rapidly advance its AI capabilities despite the impact on the local community.
- Monday, May 27, 2024
xAI is planning to build a supercomputer to power the next version of its Grok AI chatbot. Elon Musk says he wants to get the proposed supercomputer running by the fall of 2025. xAI could be partnering with Oracle for the project. When completed, the supercomputer will be at least four times the size of the biggest GPU clusters that exist today. Musk estimates that running the Grok 3 model and beyond will require 100,000 Nvidia H100 chips.
- Monday, April 1, 2024
Microsoft and OpenAI are reportedly planning a joint data center project that could reach $100 billion in cost, culminating in the launch of a massive AI supercomputer named “Stargate” by 2028.
- Thursday, August 1, 2024
Elon Musk has announced that his AI company is training its Grok LLM on a powerful new training cluster, aiming to create the most powerful AI by December. X collects user data by default to train Grok; this can be disabled only via a setting available in the web app. The news has fueled concerns about AI 'inbreeding' and chatbots reaching their peak, amid growing discussion of AI data usage and potential regulatory impacts.
- Monday, April 1, 2024
xAI announced its next model, with 128k context length and improved reasoning capabilities. It excels at retrieval and programming.
- Thursday, August 8, 2024
Meta plans to significantly increase computing capacity for training its next-gen large language model, Llama 4, expecting a 10x compute increase over Llama 3. The investment in AI training infrastructure will drive up capital expenditures in 2025. Despite heavy spending, Meta doesn't foresee immediate significant revenue from Gen AI products.
- Thursday, April 4, 2024
AI infrastructure, underpinned by GPUs, specialized software, and cloud services, is essential for the deployment and scaling of AI technologies.
- Wednesday, April 10, 2024
Intel has announced its new Gaudi 3 AI processors, claiming up to 1.7x the training performance, 50% better inference performance, and 40% better power efficiency than Nvidia's H100 processors, at a lower cost.
- Tuesday, October 1, 2024
Sam Altman, the CEO of OpenAI, is reportedly advocating for the Biden administration to support the establishment of a network of large-scale AI datacenters across the United States. Each datacenter would require up to five gigawatts of power, roughly the output of several nuclear reactors. The proposal frames these facilities as vital for national security and for maintaining the U.S.'s technological edge over China, and it suggests starting with one datacenter before expanding to five to seven in total.

Even a single facility poses substantial challenges, particularly in terms of power supply. Its energy demands would require power stations among the largest in the country, second only to the Grand Coulee hydro plant in Washington state. The energy landscape is already strained, with many datacenter projects delayed by power shortages, and major cloud providers are moving to secure energy sources: Microsoft recently entered a long-term agreement to revive the Three Mile Island nuclear power plant, and Amazon has acquired access to significant power through its partnership with Talen Energy.

Sourcing enough advanced computing hardware, such as Nvidia's GPUs, is another hurdle. A datacenter of this scale could house millions of GPUs, but the supply chain for these components is already under pressure, and Nvidia's production capacity is being closely watched as demand for high-performance chips continues to rise.

Altman's ambitious vision for AI infrastructure is not new; he has previously proposed large-scale projects, including a $7 trillion initiative to create a network of chip factories. While the current datacenter proposal may be seen as a way to prompt government investment in AI development, it also highlights the broader challenges facing the tech industry in scaling up infrastructure to meet growing demands. The conversation around these datacenters reflects a critical moment at the intersection of technology, energy, and national security as the U.S. navigates its position in the global AI landscape.
- Friday, June 7, 2024
OpenAI has outlined the security architecture of its AI training supercomputers, emphasizing the protection of sensitive model weights and other assets using Azure-based infrastructure and Kubernetes for orchestration.
- Monday, April 15, 2024
xAI has announced that its latest flagship model has vision capabilities on par with (and in some cases exceeding) state-of-the-art models.
- Wednesday, May 15, 2024
Elon Musk's AI startup xAI is negotiating a potential $10 billion deal to rent cloud servers from Oracle, aiming to become one of Oracle's largest customers and rival AI offerings from OpenAI and Google.
- Wednesday, July 10, 2024
VC firm Andreessen Horowitz has secured thousands of AI chips, including Nvidia H100 GPUs, to dole out to its AI portfolio companies in exchange for equity.
- Tuesday, March 12, 2024
Cohere For AI has created a 30B+ parameter model that is quite adept at reasoning, summarization, and question answering in 10 languages.
- Tuesday, May 28, 2024
xAI has announced raising $6 billion in funding to help bring the startup's first products to market, build advanced infrastructure, and accelerate the research and development of future technologies. The funding comes from several sources, including Andreessen Horowitz, Sequoia Capital, and Saudi Arabia's Prince Alwaleed bin Talal. Elon Musk plans to launch xAI's new data center by the fall of 2025. Musk has said he would prefer to build AI and robotics products outside of Tesla unless he gains more control of the company.
- Wednesday, October 2, 2024
Sam Altman, the chief executive of OpenAI, has embarked on an ambitious initiative to dramatically expand the computing power available for developing advanced artificial intelligence. His vision involves a multitrillion-dollar collaboration with investors from the United Arab Emirates, Asian chip manufacturers, and U.S. officials to establish new chip factories and data centers globally, including in the Middle East. The plan, initially met with skepticism and regulatory concerns, has evolved into a broader strategy that includes building infrastructure in the United States to win governmental support.

The core of Altman's proposal is a vast network of data centers serving as a global reservoir of computing power dedicated to the next generation of AI, an effort many in the industry believe could be as transformative as the Industrial Revolution. Although Altman initially sought investments amounting to trillions of dollars, he has since adjusted his target to hundreds of billions, focusing on winning over U.S. government officials by prioritizing domestic data center construction.

OpenAI is also in discussions to raise $6.5 billion to support its operations, as its expenses currently exceed its revenue, and it is exploring partnerships with major tech firms and investors, including Microsoft, Nvidia, and Apple, to secure the necessary funding. Altman has drawn parallels between the proliferation of data centers and the historical spread of electricity, suggesting that as data center availability increases, innovative uses for AI will emerge.

The plan includes the construction of chip-making plants, which can cost up to $43 billion each, aimed at reducing manufacturing costs for leading chip producers such as Taiwan Semiconductor Manufacturing Company (TSMC). OpenAI has held talks with TSMC and other chipmakers to facilitate this vision, while weighing the geopolitical implications of building such infrastructure in the UAE, given concerns about national security and potential Chinese influence. OpenAI has also explored opportunities in Japan and Germany, proposing data centers powered by renewable energy, but political pressures have led the company to refocus its efforts on the U.S. market.

Altman has presented a study advocating for new data centers in the U.S., emphasizing their potential to drive re-industrialization and create jobs, and OpenAI has bolstered its team with experienced policy advisors to strengthen its infrastructure strategy. Altman remains mindful of the competitive landscape, warning that the U.S. risks falling behind China in AI development if it does not collaborate with international partners. The ongoing dialogue between U.S. and Emirati officials underscores the importance of this initiative in shaping the future of AI technology.
- Friday, March 8, 2024
Answer AI has released a new FSDP/QLoRA training tool that makes it possible to train 70B-parameter models on consumer GPUs. The code is open source and easy to run locally or on RunPod; a sketch of the underlying QLoRA setup appears below.
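The core idea behind this approach is to quantize the frozen base model to 4 bits and train only small LoRA adapters on top. Below is a minimal sketch of that QLoRA side using the Hugging Face transformers, peft, and bitsandbytes libraries; Answer AI's actual tool goes further by sharding the quantized model across multiple consumer GPUs with PyTorch FSDP, and the model name, LoRA hyperparameters, and target modules shown here are illustrative assumptions rather than Answer AI's defaults.

```python
# Minimal QLoRA sketch: 4-bit quantized frozen base model + trainable LoRA adapters.
# Assumes transformers, peft, and bitsandbytes are installed; values are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # quantize frozen base weights to 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16 on top of 4-bit weights
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",             # placeholder 70B base model
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)   # only the small LoRA adapters are trainable
model.print_trainable_parameters()           # a tiny fraction of the 70B parameters
```

On its own, this setup keeps optimizer state and gradients tiny because only the adapters train; combining it with FSDP, as Answer AI does, additionally spreads the quantized base weights across GPUs so a 70B model fits on cards with 24 GB of memory.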
- Thursday, March 14, 2024
Together AI, a compute provider and AI research group, has raised additional funding in a round led by Salesforce Ventures and other top VCs. The company is growing 3x month over month.
- Friday, May 17, 2024
Hugging Face is committing $10 million in free shared GPUs to help developers, academics, and startups create new AI technologies, aiming to counteract the centralization of AI advancements dominated by tech giants.
- Thursday, April 25, 2024
FlexAI launched with $30 million in seed funding led by Alpha Intelligence Capital, Elaia Partners, and Heartcore Capital. The company is rearchitecting compute infrastructure to deliver "universal AI compute": effective, seamless infrastructure for advancing artificial intelligence. FlexAI's cloud service, launching later this year, lets developers use heterogeneous compute architectures to build and train AI applications reliably and efficiently.
- Tuesday, September 24, 2024
OpenAI is starting a program for low- and middle-income countries to expand access to AI knowledge. It has also released a professional translation of MMLU (a standard reasoning benchmark) into 15 different languages.
- Friday, June 7, 2024
Elon Musk redirected 12,000 H100 GPUs ordered by Tesla to X. He had told investors in April that Tesla had spent $1 billion on GPUs in the first three months of the year. Tesla has been developing its own in-house supercomputer for AI, but Musk has previously said it would be redundant if the company could source more H100s. A separate batch of 12,000 H100s originally ordered for X will reportedly be redirected to Tesla.
- Monday, April 22, 2024
NVIDIA will power Japan's new quantum supercomputer, ABCI-Q, alongside Fujitsu, integrating 2,000 NVIDIA H100 GPUs and the CUDA-Q platform for hybrid quantum-classical computing applications. The project aims to advance Japan's capabilities in quantum computing and AI and is part of a broader technological partnership between NVIDIA and Japan.